Object Grasp Control of a 3D Robot Arm by Combining EOG Gaze Estimation and Camera-Based Object Recognition

The purpose of this paper is to achieve quick and stable object grasping with a 3D robot arm controlled by electrooculography (EOG) signals. An EOG signal is a biological signal generated when the eyeballs move, and it can be used for gaze estimation. In conventional research, gaze estimation has been used to control a 3D robot arm for welfare purposes. However, it is known that the EOG signal loses some of the eye movement information as it travels through the skin, resulting in errors in EOG gaze estimation. Thus, it is difficult for EOG gaze estimation to point at an object accurately, and the object may not be grasped appropriately. Therefore, it is important to develop a methodology that compensates for the lost information and increases spatial accuracy. This paper aims to realize highly accurate object grasping with a robot arm by combining EOG gaze estimation with object recognition based on camera image processing. The system consists of a robot arm, top and side cameras, a display showing the camera images, and an EOG measurement analyzer. The user manipulates the robot arm through the camera images, which can be switched, and the EOG gaze estimation specifies the object. In the beginning, the user gazes at the screen’s center position and then moves their eyes to gaze at the object to be grasped. After that, the proposed system recognizes the object in the camera image via image processing and grasps it using the object centroid. The object is selected as the one whose centroid is closest to the estimated gaze position within a certain distance (threshold), thus enabling highly accurate object grasping. The observed size of the object on the screen can differ depending on the camera installation and the screen display state. Therefore, it is crucial to set the distance threshold from the object centroid for object selection. The first experiment is conducted to clarify the distance error of the EOG gaze estimation in the proposed system configuration. As a result, it is confirmed that the range of the distance error is 1.8–3.0 cm. The second experiment evaluates the performance of object grasping with two thresholds set from the first experimental results: the medium distance error value of 2 cm and the maximum distance error value of 3 cm. As a result, it is found that grasping with the 3 cm threshold is 27% faster than with the 2 cm threshold due to more stable object selection.


Introduction
In recent years, assistive devices for people with motor difficulties have attracted attention. However, the operation panels of conventional assistive devices are based on finger operation, so people with tetraplegia, such as amyotrophic lateral sclerosis (ALS) patients, who are paralyzed from the neck down, cannot use them by themselves. This can make it harder for them to take care of themselves or ensure their own survival, as constant assistance is required. The purpose of this study is to develop a system that can supplement errors in EOG gaze estimation and to verify its performance.

Proposed System
To investigate real-time object grasping using EOG, a 3D robot workspace is implemented. Figure 1 shows the conceptual system. It consists of a robot arm with a gripper composed of four Dynamixel AX-12 servomotors (Robotis Co., Ltd., Seoul, Republic of Korea), two USB cameras for taking pictures from above and from the side of the robot arm, a bio-signal measurement device (two channels) for EOG signals, a PC for analyzing EOG signals (Microsoft Visual Studio 2017 C++ software), a display for showing the images from the USB cameras, and two grasping objects (one is a red cube and the other is a blue cube). The software consists of EOG estimation, object recognition for camera images, and robot arm control based on inverse kinematics.

EOG Measurement Method
The bio-signal measurement system consists of five disposable electrodes, two bio-signal sensors, and a data acquisition device (National Instruments USB-6008), as shown in Figure 2. The electrodes are attached to the periphery of the eyes; horizontal eye movement is measured by the Ch1 signal and vertical eye movement by the Ch2 signal. Figure 3 shows the configuration of the electrodes on the face. EOG can be divided into two types, DC-EOG and AC-EOG, depending on the band-pass settings; AC-EOG is used in this study. The bio-signal sensor (measurement circuit) is therefore configured as a band-pass filter (low-cut frequency: 1.06 Hz, high-cut frequency: 4.97 Hz, gain: 78 dB) to measure AC-EOG signals. The schematic of the sensor is shown in Figure 4. The data acquisition device then converts the voltage signal into digital data at a sampling rate of 2 kHz. The data are transmitted to a PC every two seconds and handled as numerical values using Microsoft Visual Studio 2017 C++.

The EOG system is configured with two channels (Ch1 and Ch2). Based on Table 1, the system has a band-pass filter with a lower cut-off frequency of 1.06 Hz and an upper cut-off frequency of 4.97 Hz; this filter converts the DC-EOG into AC-EOG. The gain of the system is 78 dB, with a sampling frequency of 1 kHz in the data acquisition device. Figures 5 and 6 show the Bode plots of the Ch1 and Ch2 amplifier circuits, respectively. In each figure, the upper plot is the gain diagram, which shows how the gain magnitude changes with frequency, and the lower plot is the phase diagram, which shows how the phase changes with frequency. These Bode plots characterize a band-pass filter with a lower cut-off frequency of 1.06 Hz and an upper cut-off frequency of 4.97 Hz, matching the designed amplification characteristics.
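For readers who want to reproduce the pass-band behavior in software, the following is a minimal C++ sketch of a 1.06–4.97 Hz band-pass applied to one sampled EOG window. The real system filters in the analog measurement circuit of Figure 4; the first-order digital approximation, the class name, and the coefficients below are illustrative assumptions, not the circuit design.

```cpp
#include <cmath>
#include <vector>

// Illustrative digital approximation of the analog 1.06-4.97 Hz band-pass stage.
class BandPass {
public:
    BandPass(double fLowCut, double fHighCut, double fs) {
        const double kTwoPi = 2.0 * 3.14159265358979323846;
        // First-order high-pass (cut 1.06 Hz) and low-pass (cut 4.97 Hz) coefficients.
        aHp_ = std::exp(-kTwoPi * fLowCut / fs);
        aLp_ = 1.0 - std::exp(-kTwoPi * fHighCut / fs);
    }
    double process(double x) {
        // High-pass: removes the DC-EOG drift component.
        hp_ = aHp_ * (hp_ + x - xPrev_);
        xPrev_ = x;
        // Low-pass: suppresses noise above roughly 5 Hz.
        lp_ += aLp_ * (hp_ - lp_);
        return lp_;
    }
private:
    double aHp_ = 0.0, aLp_ = 0.0;
    double xPrev_ = 0.0, hp_ = 0.0, lp_ = 0.0;
};

// Example: filter one 2 s window sampled at 2 kHz (4000 samples).
std::vector<double> filterWindow(const std::vector<double>& raw) {
    BandPass bp(1.06, 4.97, 2000.0);
    std::vector<double> out;
    out.reserve(raw.size());
    for (double v : raw) out.push_back(bp.process(v));
    return out;
}
```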

EOG Gaze Estimation Method
The AC-EOG gaze position can be estimated by analyzing the positive and negative amplitudes and the area (integral value) of the two AC-EOG signals every two seconds. For the direction of eye movement, positive and negative thresholds are set, and the direction is determined by which threshold the EOG signal exceeds. The amplitude of the EOG Ch1 signal corresponds to horizontal eye movements and is used to determine left and right movements. The amplitude of the EOG Ch2 signal corresponds to vertical eye movements and is used to determine up and down movements. The amount of eye movement is calculated from the integral value of the AC-EOG amplitude: Ch1 gives the horizontal movement amount and Ch2 gives the vertical movement amount.
The visual point (x-coordinate from Ch1 and y-coordinate from Ch2) is estimated by combining these movement directions and amounts. In order to match the position of the displayed image with the estimated EOG gaze point, the two EOG signals must be measured while the user gazes at a specific point during calibration, establishing a correspondence between the EOG amplitude and the image size. The distance between the user's eyes and the screen is fixed at 35.0 cm. Figures 7 and 8 show the eye gaze configuration and the EOG gaze estimation.
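As a concrete illustration of this estimation step, the following C++ sketch converts one two-second window per channel into a signed displacement on the screen. The threshold values, the calibration gain, and the function names are assumptions for demonstration; the actual scaling is determined in the calibration described above.

```cpp
#include <cmath>
#include <vector>

struct GazePoint { double x, y; };   // gaze position on the displayed camera image [cm]

// Integrate one 2 s channel window and convert it to a signed displacement.
double channelDisplacement(const std::vector<double>& window,
                           double posThreshold, double negThreshold,
                           double cmPerVoltSecond, double dt)
{
    // Direction: which threshold does the signal exceed first?
    int sign = 0;
    for (double v : window) {
        if (v > posThreshold) { sign = +1; break; }
        if (v < negThreshold) { sign = -1; break; }
    }
    if (sign == 0) return 0.0;               // no eye movement detected in this window
    // Amount: area (integral of the absolute amplitude) scaled by the calibration gain.
    double area = 0.0;
    for (double v : window) area += std::fabs(v) * dt;
    return sign * area * cmPerVoltSecond;
}

// Combine Ch1 (horizontal) and Ch2 (vertical) into a gaze point relative to the screen center.
GazePoint estimateGaze(const std::vector<double>& ch1, const std::vector<double>& ch2)
{
    const double dt    = 1.0 / 2000.0;        // 2 kHz sampling
    const double posTh = 0.5, negTh = -0.5;   // assumed detection thresholds [V]
    const double kCal  = 10.0;                // assumed calibration gain [cm/(V*s)]
    return { channelDisplacement(ch1, posTh, negTh, kCal, dt),
             channelDisplacement(ch2, posTh, negTh, kCal, dt) };
}
```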

Integration Algorithm between Camera Object Recognition and EOG Gaze Estimation
Previous research has reported that an error occurs between the estimated EOG gaze position [8,9] and the true object position. Since this error is caused by the degradation of the EOG signals and is essentially impossible to recover, using object recognition on the camera images should be an effective way to compensate for it. As the object recognition algorithm, a simple HSV color-based method is applied to recognize the two objects (a red cube and a blue cube). The essential recognition algorithm is shown in Figure 9. Since the main purpose of this paper is to verify the combined operation of EOG gaze estimation and object recognition, the HSV method extracts the red and blue color areas from the camera images, binarizes them into black and white, and calculates the centroid of each object, as shown in Figure 10. Then, the distance between the estimated EOG gaze position and the centroid of each object is calculated. Finally, the object with the smallest distance within a certain distance (threshold) is judged to be the target object to be grasped.
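A minimal sketch of this recognition step using OpenCV is shown below. The HSV bounds for the red and blue cubes are illustrative assumptions (red in particular may also require the upper hue band near 180), and the real system may use different threshold values.

```cpp
#include <opencv2/opencv.hpp>

// Extract one color region, binarize it, and return its centroid in pixel coordinates.
cv::Point2d colorCentroid(const cv::Mat& bgrImage,
                          const cv::Scalar& hsvLower, const cv::Scalar& hsvUpper)
{
    cv::Mat hsv, mask;
    cv::cvtColor(bgrImage, hsv, cv::COLOR_BGR2HSV);       // convert to the HSV color space
    cv::inRange(hsv, hsvLower, hsvUpper, mask);            // binarize: white = target color
    cv::Moments m = cv::moments(mask, /*binaryImage=*/true);
    if (m.m00 <= 0.0) return {-1.0, -1.0};                 // color not found in the image
    return { m.m10 / m.m00, m.m01 / m.m00 };               // centroid of the white region
}

// Example usage with assumed HSV bounds for the blue and red cubes:
//   cv::Point2d blue = colorCentroid(frame, cv::Scalar(100, 80, 80), cv::Scalar(130, 255, 255));
//   cv::Point2d red  = colorCentroid(frame, cv::Scalar(  0, 80, 80), cv::Scalar( 10, 255, 255));
```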
On the other hand, if the distance error exceeds the threshold, the system redoes the EOG gaze estimation. As a supplementary note, object recognition is suitable for determining grasp strategies because it captures not only the position of the object but also its posture, which leads to an effective grasping process for objects with complex shapes. Since the appropriate threshold value largely depends on the measurement environment, Experiment 1 clarifies the EOG gaze estimation error range of the constructed system. Two types of threshold values are then examined, the medium error value and the maximum error value, in order to discuss an appropriate threshold setting method.
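The selection rule itself can be summarized by a short sketch such as the following, where the candidate structure, the function name, and the use of centimeters on the displayed image are assumptions for illustration; a return value of std::nullopt corresponds to redoing the EOG gaze estimation.

```cpp
#include <cmath>
#include <optional>
#include <vector>

struct ObjectCandidate { int id; double cx, cy; };   // centroid from camera object recognition

// Pick the object whose centroid is closest to the estimated gaze position,
// but only if it lies within the distance threshold.
std::optional<int> selectTarget(double gazeX, double gazeY,
                                const std::vector<ObjectCandidate>& objects,
                                double thresholdCm)
{
    std::optional<int> best;
    double bestDist = thresholdCm;                    // anything farther is rejected
    for (const auto& obj : objects) {
        double d = std::hypot(obj.cx - gazeX, obj.cy - gazeY);
        if (d <= bestDist) { bestDist = d; best = obj.id; }
    }
    return best;   // std::nullopt: no object within the threshold, redo gaze estimation
}
```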

Robot Arm Control Process
The robot arm with a gripper has a total of four degrees of freedom, as shown in Figure 11. Three degrees of freedom control the arm links and one degree of freedom controls the gripper. The user operates the 3D motion of the robot arm through two consecutive EOG gaze estimation inputs, one on the top camera image (Figure 12: Up) and one on the side camera image (Figure 12: Down). The camera view is automatically switched when the first EOG gaze estimation is completed. Then, when the object to be grasped is specified after the second EOG gaze estimation, inverse kinematics is performed using the object's centroid as the input value, as shown in Figure 13. From this value, the motor angles of the three arm joints are calculated, and the robot arm motion is executed.
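For clarity, the overall sequence can be sketched as follows in C++. All routine names are hypothetical placeholders standing in for the system's actual functions; only the ordering of the steps follows the description above.

```cpp
// Hypothetical placeholder declarations (assumptions, not the paper's actual API).
bool estimateGazeAndSelectObject(bool topCamera, double& u, double& v); // false: no object within threshold
void switchDisplayedCamera();
void solveInverseKinematics(double x, double y, double z,
                            double& th1, double& th2, double& th3);
void moveJointsAndCloseGripper(double th1, double th2, double th3);

// One grasping cycle following the sequence described above.
void graspOnce()
{
    double x = 0.0, y = 0.0, z = 0.0;

    // 1) First gaze input on the top camera image: fixes the horizontal position (x, y).
    while (!estimateGazeAndSelectObject(/*topCamera=*/true, x, y)) {
        // distance error above the threshold: redo the EOG gaze estimation
    }
    switchDisplayedCamera();   // 2) automatic switch to the side view

    // 3) Second gaze input on the side camera image: fixes the height (z).
    while (!estimateGazeAndSelectObject(/*topCamera=*/false, x, z)) {
        // redo until an object centroid lies within the threshold
    }

    // 4) Inverse kinematics from the object's centroid, then execute the motion.
    double th1, th2, th3;
    solveInverseKinematics(x, y, z, th1, th2, th3);
    moveJointsAndCloseGripper(th1, th2, th3);
}
```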

Inverse Kinematics
Since the input of the actual machine is the joint angle, it is necessary to convert the target trajectory into the target angle of each joint. Since inverse kinematics is generally used for the trajectory generation of robot arms, the inverse kinematics is derived below. Figure 13 shows the coordinate system set for the manipulator. All coordinate systems are left-handed, and the reference coordinate system is set at the base of the manipulator. We assume that the origins of Σ0 through Σ1 overlap. l1 and l2 are the distance between the axes of Joint2 and Joint3 and the distance from the axis of Joint3 to the tip, respectively. In addition, θ1, θ2, and θ3 are the motion angles of Joint1, Joint2, and Joint3, respectively, and the upright state corresponds to 0 deg. Normally, the posture of the robot arm tip should also be determined, but the manipulator used here does not have enough degrees of freedom, so the tip posture is not considered.

Derivation of Inverse Kinematics
If the coordinate transformation matrix from Σi−1 to Σi is written as i−1Ti, each transformation matrix is obtained as follows.
Using the above, the transformation matrix from the reference coordinate system to the tip coordinate system is given in Equation (9).
Here, R represents the rotation matrix and d represents the translation vector. In this study, the rotating coordinate system is not considered; only the translation vector d is used in the derivation of the inverse kinematics. The translation vector d is defined as follows.
From the x and y components of Equation (10), the following equation holds.
Therefore, θ1 can be calculated as follows.
Furthermore, the square of the length of the vector in Equation (10) gives Equation (13).
In Equation (21), there are two solutions because the manipulator can take two postures for any position (Figure 14). Therefore, Equation (22), which is closer to the initial posture, is used as the solution, where l0 = 0.11 m, l1 = 0.092 m, and l2 = 0.14 m.
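Since the equation bodies are not reproduced above, the following LaTeX block gives one standard closed-form reconstruction consistent with the steps described (θ1 from the x and y components, θ3 from the squared vector length, and two elbow postures). The sign conventions and the role of l0 as the base height offset are assumptions and may differ from Equations (9)–(22) in the original.

```latex
% Reconstruction (assumed conventions): tip position p = (p_x, p_y, p_z),
% base height l_0, link lengths l_1, l_2, joint angles measured from the upright state.
\begin{align}
  \theta_1 &= \operatorname{atan2}(p_y,\; p_x), \\
  r &= \sqrt{p_x^2 + p_y^2}, \qquad s = p_z - l_0, \\
  \cos\theta_3 &= \frac{r^2 + s^2 - l_1^2 - l_2^2}{2\, l_1 l_2}, \\
  \theta_3 &= \operatorname{atan2}\!\left(\pm\sqrt{1 - \cos^2\theta_3},\; \cos\theta_3\right), \\
  \theta_2 &= \operatorname{atan2}(r,\; s)
              - \operatorname{atan2}\!\left(l_2 \sin\theta_3,\; l_1 + l_2 \cos\theta_3\right).
\end{align}
```

The two signs in θ3 correspond to the two postures of Figure 14; choosing the branch closer to the initial posture corresponds to the selection described for Equation (22).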

Experiment 1: The Investigation of the Distance Error of EOG Gaze Estimation
The first experiment verifies the distance error of EOG gaze estimation (i.e., the gap between the estimated EOG gaze position and the target object's centroid calculated from the camera object recognition). The procedure is as follows: first, the proposed camera object recognition algorithm calculates the centroids of red and blue cubes (height: 3.0 cm, length: 3.0 cm, and width: 3.0 cm), and then a hundred EOG gaze estimation trials are conducted for each of the two cubes, and the distance error is analyzed.
The results in Figure 15 indicate no significant difference in the distance error between the red and blue cubes, confirming that neither the color of the objects nor their position in the camera image has an effect. Overall, the distance error for the blue object has a mean value of 1.865 cm, a maximum value of 5.675 cm, and a minimum value of 0.171 cm, while the distance error for the red object has a mean value of 2.188 cm, a maximum value of 6.985 cm, and a minimum value of 0.1761 cm. It is confirmed that the distance error in our experimental setup ranges from 1.8 to 3.0 cm. Hence, Experiment 2 uses the medium distance error value of 2 cm and the maximum distance error value of 3 cm based on the above results. These values are used to verify the appropriate threshold setting for achieving higher speed and more stability in the proposed system.
Figure 15. The distance errors of the EOG gaze estimations for the blue and red cubes. Distance errors beyond the whiskers are displayed using red + marks.

Experiment 2: An Investigation of the Speed and Stability of the Proposed System
The second experiment evaluates the performance of the proposed system in terms of speed and stability for object-grasping tasks at two different thresholds. The experiment is conducted on five subjects (age range: 20 to 40 years). The experimenter first explains the experiment's significance and procedure, attaches the EOG electrodes to the subject, and confirms that the subject can operate the system appropriately.
In the experiment, the subject first gazes at the center position of the top image and then gazes at the object to be grasped in the top image. Next, the image is switched to the side image when the proposed system appropriately estimates the subject's first EOG gaze position. The subject then gazes at the center position of the side image and at the object to be grasped in the side image, after which the trial ends. Finally, the subject performs eight trials for each threshold setting; the first five are practice trials, and the last three are used for performance evaluation.
As shown by the results in Table 2 and Figure 16, the average achievement time with the 2 cm threshold is 71 s, while the average achievement time with the 3 cm threshold is 52 s. This comparison indicates that the 3 cm threshold handles the distance error more appropriately than the 2 cm threshold, resulting in a series of object-grasping operations that is 27% faster. A narrower threshold makes object selection more precise; however, the user then needs undivided concentration during eye gazing to perform the control smoothly.
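For reference, the 27% figure corresponds to the relative reduction in the average achievement time: (71 s − 52 s) / 71 s ≈ 0.27.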

Conclusions
This study aims to improve the object-grasping performance of 3D robot arm control based on EOG gaze estimation for people with tetraplegia. In conventional research, EOG gaze estimation has been problematic in that object grasping fails inconsistently because of estimation errors and the inability to specify objects with high spatial accuracy. This paper introduces a new method that uses object recognition based on camera images to compensate for the EOG gaze estimation errors and achieve stable object grasping. The proposed system consists of two cameras installed above and to the side of the robot arm, whose images are viewed on a computer display. In the investigation experiment with five subjects, it is found that the error of EOG gaze estimation in the proposed system configuration ranges between 1.8 cm and 3.0 cm. Next, two threshold values are set based on the EOG gaze estimation errors from the first experiment. Finally, an object-grasping experiment with the robot arm is conducted using a control input that combines camera object recognition and EOG gaze estimation. As a result, when the threshold is set to the medium error value of 2.0 cm, the robot arm sometimes fails to specify the object, taking 71 s on average to grasp it. On the other hand, when the maximum error value of 3.0 cm is set as the threshold, the object selection is stable and the average time is 52 s, i.e., grasping is 27% faster than with the medium error value as the threshold. In conclusion, it is shown that the introduction of camera object recognition can compensate for the error of the EOG gaze estimation and realize precise object grasping, whereas conventional EOG gaze estimation alone retains error values of 1.8 cm to 3.0 cm.
Based on these results, the proposed system has the following advantages and disadvantages. The advantages are as follows: (1) the system enables the remote control of a robot arm using the EOG method; (2) the implementation of the simple image processing method improves the EOG gaze estimation by targeting the center point of the object; and (3) the robot gripper is able to grasp the target object successfully in order to conduct displacement tasks. As for the disadvantages, (1) the user is constrained to stay still and positioned 0.35 m from the computer display, and (2) the condition of the captured camera image can affect the accuracy of the image processing, e.g., the image brightness from room lighting and the image pixel quality.

Institutional Review Board Statement:
The study was conducted in line with the guidelines of the Declaration of Helsinki and approved by the Ethics Committee of Gifu University (approval procedure number 2020).

Informed Consent Statement:
Informed consent was obtained from all subjects involved in the study.

Data Availability Statement:
The data presented in this study are available on request from the corresponding author. The data are not publicly available due to privacy restrictions.

Conflicts of Interest:
The authors declare no conflict of interest.